ECHO, Synthetic Flight Training Programs and Devices,
from  INTACT,  Integrated  Contact/Instrument  Training,  and 
from  SYNTRAIN,  Modernization  of  Synthetic  Training  in 
Army  Aviation.  Dr.  Wallace  W.  Prophet  is  the  Director  of 
Division  No.  6. 


PERFORMANCE  MEASUREMENT  IN  HELICOPTER 
TRAINING  AND  OPERATIONS 

Wallace  W.  Prophet 


The  past  two  decades  have  seen  a  tremendous  change  in  the  role  of  aviation  in  the 
U.S.  Army.  The  use  of  the  helicopter  has  given  a  dimension  and  degree  of  mobility 
heretofore  impossible  for  ground  forces.  An  idea  of  the  extent  of  growth  in  the  Army 
aviation field can be gained from the following figures. In 1950, the Army had only 715
aviators and 1242 aircraft; by 1960, these totals had risen to [?]984 and 5477, respectively;
in 1970, there was a total of 22,250 aviators and 11,446 aircraft.

This  is  truly  an  amazing  growth  and  reflects  the  great  utility  of  the  helicopter  in 
performing  a  wide  variety  of  airlift  roles.  The  versatility  of  these  sometimes  awkward 
looking  and  noisy  machines  undoubtedly  will  result  in  a  continuing  increase  in  their 
application  to  both  civil  and  military  needs.  For  example,  the  helicopter  is  already  being 
used  for  such  diverse  activities  as  transportation  of  persons;  oil  exploration;  patrolling  of 
forests,  game  preserves,  power  lines,  and  pipelines;  traffic  control  and  other  aspects  of 
law  enforcement;  medical  evacuation;  and  heavy  construction. 

This  great  increase  in  the  use  of  aircraft,  particularly  helicopters,  in  the  Army  has 
brought  with  it  a  tremendous  expansion  in  the  Army’s  flight  training  program.  For 
example,  for  fiscal  years  1966  through  1970,  the  Army  graduated  the  following  numbers 
of  initial  entry  pilots:  1966,  1869;  1967,  4257;  1968,  5295;  1969,  7699;  and  1970, 
7525. 

The  progress  of  training  for  the  initial  entry  student  has  been  approximately  as 
follows:  The  helicopter,  or  rotary  wing,  student  began  his  primary  training  (110  flight 
hours  over  16  calendar  weeks)  at  the  U.S.  Army  Primary  Helicopter  School,  Fort  Wolters, 
Texas.  He  then  moved  on  to  either  Hunter  Army  Air  Field,  near  Savannah,  Georgia,  or  to 
Fort  Rucker,  Alabama,  to  complete  the  remainder  of  his  training  (100  hours  over  16 
weeks).  Upon  graduation  from  this  sequence  of  training,  the  new  aviator  received  his 
wings  and  was  assigned  to  an  operational  unit,  usually  in  Vietnam.  The  fixed  wing  trainee 
received  a  similar  amount  of  instruction,  except  that  his  primary  instruction  (110  hours 
over  16  weeks)  was  given  at  Fort  Stewart,  Georgia,  and  he  then  moved  to  Fort  Rucker  to 
complete  his  training  (100  hours  over  16  weeks). 

As  in  any  educational  or  training  system,  measurement  plays  an  extremely  critical 
role in flight training. The requirements for achieving psychometrically sound performance
measures are well known. However, the problems of meeting them are vastly compounded when the
performance measured is as complex as that required in flying. Furthermore, the functional
uses to which the measurements may be put are multifarious. Overlying these
considerations  are  the  facts  that  aviation  training  is  very  expensive  and  adequacy  of 
performance  may  have  life  or  death  consequences  for  the  pilot  and  perhaps  others. 

HumRRO has just completed its 20th year as a research and development organization
concerned primarily with human functioning and performance in the world of work.
For  the  last  15  of  those  20  years,  HumRRO  has  been  working  actively  on  problems 
related  to  aviation  training  and  flight  performance.  This  paper  will  describe  some  of  the 
work  relating  to  performance  measurement. 

An  examination  of  some  of  the  uses  to  which  flight  performance  measures  may  be 
put  will  provide  a  background  for  the  discussion  of  HumRRO’s  flight  performance 
research program. Our research is marked by emphasis on pragmatic, utilitarian aspects.
First,  the  most  obvious  application  of  performance  measurement  to  the  individual  trainee 
is  the  determination  of  who  passes  and  who  fails.  Here,  performance  measures  refer  to 
measures  of  achievement  in  the  flight  training  program,  such  as  the  periodic  checkrides 
given at various points during training. Failure to perform satisfactorily results in elimination
from the training program. However, the extent to which these measures are
predictive  of,  or  relevant  to,  future  pilot  performance  is  also  of  concern.  This  predictive 
function  is  of  special  importance  to  the  Army,  because,  unlike  his  fellow  pilots  in  the 
other  services,  the  Army  pilot  assumes  duties  in  an  operational  unit  (usually  in  combat) 
immediately  after  the  completion  of  his  undergraduate  pilot  training  (UPT).  In  contrast, 
the  Air  Force  UPT  graduate  goes  on  for  further  training  and  assessment  at  a  Combat 
Crew  Training  School,  while  the  graduate  of  Navy  UPT  goes  to  a  Replacement  Air 
Group  Squadron  for  such  work. 

Other uses of flight performance measures, principally by the individual flight instructor,
also focus on the individual pilot trainee. The instructor may use daily flight grades or performance
records as a basis for counseling the trainee and for modifying the training
presentation. He may also use grades in an attempt to motivate the student through
selective reinforcement.

The  great  majority  of  past  research  on  flight  performance  measurement  has  tended 
to  concentrate  on  the  use  of  such  measures  with  the  individual  student.  Hence,  we  have 
seen  much  research  dealing  with  reliability,  as  in  the  stability  of  trainee  performance  (or 
measures  of  his  performance)  from  one  occasion  to  another,  based  upon  the  classical 
test-retest  paradigm.  Studies  of  checkpilot  flight  standards  (i.e.,  interobserver  reliability) 
also  fall  in  this  area. 

Although  this  concern  with  the  individual  in  training  is  paramount,  over  the  past 
decade  flight  performance  measures  have  also  been  used  as  a  management  tool.  In  these 
functional  areas  fall  recruitment,  selection,  and  general  manpower  management.  Also,  we 
have  been  quite  concerned  with  applications  of  the  concept  of  quality  control  in  aviation 
training  systems.  Here,  the  focus  is  on  feedback  to  the  training  system  from  those 
external  operational  systems  with  which  it  interfaces,  as  well  as  with  feedback  loops 
entirely within the training system. One particularly critical feedback loop between the
training system and the criterion world of flight operations involves aviation safety. The
goal of such efforts is the development of a continuing, dynamic means for adjusting and
regulating aviation training systems. All applications assume sound indices of performance.

One  of  the  first  areas  of  aviation  psychology  investigation  undertaken  by  HumRRO 
was helicopter flight performance evaluation methods. In a series of studies by Greer,
Smith, and Hatfield (1), attempts were made to develop more objective and reliable
means of evaluating student performance in helicopters. Initial investigation indicated that
the traditional subjective grading system then in use had quite low reliability, a finding
consonant with those reported in earlier summaries of research on reliability of subjective
checkrides (e.g., see Erickson, 2, and Ben-Avi, 3). Correlations of daily training grades and
flight  check  grades  during  rotary  wing  primary  training  were  typically  less  than  .30,  while 
those  between  checkrides  given  at  various  points  in  training  were  as  low  or  lower. 

Building on previous Air Force work by Smith, Flexman, and Houston (4), Greer et
al. (1) developed a series of relatively objective flight performance checklists called Pilot
Performance  Description  Records  (PPDR).  In  constructing  the  PPDR,  each  maneuver  was 
analyzed  in  detail,  and  as  many  items  or  scales  as  possible  describing  specific  pilot  and 
aircraft  behaviors  during  the  maneuver  were  developed.  Where  feasible,  objective  indices 
were  used,  such  as  airspeed,  altitude,  and  RPM  indications.  In  each  case,  the  item  and 
conditions  for  its  observation  were  carefully  defined.  Figure  1  illustrates  a  page  from  a 
rotary wing PPDR. Similar instruments have been developed by Prophet and Jolley (5)
for  use  in  fixed  wing  flight  measurement. 

After much research effort, the PPDR was installed as an integral part of the flight
evaluation program at the Primary Helicopter School. Change from an existing subjective
flight evaluation system to a new, albeit more objective, system such as the PPDR does
not come easily. Without the support of top management such a change could not have
been made. Note should be taken of the utter necessity for systematic and thorough
training of the checkpilots in the use of the new instrument before they try it operationally
with real students.

Before-and-after results of the use of the PPDR are shown in Table 1. These data
show  correlations  between  mean  daily  flight  grade  for  a  phase  of  training  with  the  grade 
on  the  checkride  administered  at  the  end  of  that  phase.  The  subjective  system  did  not 
yield  statistically  significant  correlations,  while  grades  derived  from  the  PPDR  correlated 
significantly  with  training  grades. 


Table 1

Correlation of Mean Daily Grade and
Checkride Grade by Stage of
Primary Training

                         Stage of Training
System                Intermediate    Advanced
Subjective                .08            .09
Objective (PPDR)          .42*           .51*

*p<.05.
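
The kind of coefficient reported in Table 1 can be illustrated with a minimal sketch. The grades below are hypothetical and are not data from the study; the function names are chosen only for this illustration. The sketch computes a product-moment correlation between mean daily grade and checkride grade, and the t statistic commonly used to test such a coefficient against zero.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data: mean daily grade and checkride grade for eight students.
daily = [78, 82, 85, 88, 90, 74, 80, 92]
check = [75, 80, 88, 85, 93, 70, 79, 90]

r = pearson_r(daily, check)
t = t_statistic(r, len(daily))
print(f"r = {r:.2f}, t = {t:.2f} on {len(daily) - 2} df")
```

The t value would then be compared with a tabled critical value for the chosen significance level, such as the .05 level used in Table 1.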


The  PPDR  has  been  used  to  provide  a  greater  degree  of  standardization  and 
objectivity  to  the  flight  evaluation  process.  Because  of  the  considerable  detail  in  which  it 
describes  the  desired  or  proper  performance  of  a  maneuver,  the  PPDR  is  quite  useful  as  a 
pedagogical  tool.  First,  it  conveys  to  the  student  rather  precisely  the  performance 
objectives  he  seeks  to  achieve  and  the  items  on  which  he  will  be  evaluated.  Typically,  for 
a  checkride  of  an  hour’s  duration,  approximately  250  separate  behavioral  observations  are 
recorded  on  the  PPDR  by  the  checkpilot.  The  PPDR  also  standardizes  and  defines  for  the 
student  and  the  checkpilot,  the  sequence  of  events  on  the  checkride.  A  second  major  use 
of  the  PPDR  is  for  detailed  postflight  feedback  to  the  student  on  his  performance.  This 
feature  of  the  PPDR  has  been  found  to  be  very  useful. 

These  uses  of  the  PPDR  are  examples  of  research  applications  to  problems  of 
teaching  and  evaluating  the  individual  student.  However,  we  have  extended  this  approach 
to  problems  that  are  more  systemic  in  nature,  through  use  of  the  PPDR  as  part  of  a 
training  quality  control  system.  The  individual  performance  items  or  scales  are  used  as 
input  to  automatic  data  processing  by  which  the  performances  of  large  groups  of 
students,  for  example,  a  flight  class,  can  be  summarized.  These  performances  may  then  be 
compared  with  a  school  standard  (Figure  2).  Probability  tables  were  developed  to  allow 
evaluation  of  the  statistical  significance  of  a  given  maneuver’s  variation  from  its  school 
standard,  based  upon  the  number  of  cases  involved,  and  the  intramaneuver  performance 
variability. 
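
A comparison of a class with its school standard can be sketched as follows. This is an illustration only and is not the procedure behind the probability tables mentioned above: it assumes a simple normal-theory test in which the deviation of the class mean from the standard is scaled by the performance variability and the number of cases. All figures and names are hypothetical.

```python
import math
from statistics import NormalDist

def deviation_from_standard(class_mean, school_standard, sd, n):
    """Return (z, two-sided p) for a class mean versus the school standard."""
    standard_error = sd / math.sqrt(n)
    z = (class_mean - school_standard) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical figures: percent error on one maneuver for a class of 40 students.
z, p = deviation_from_standard(class_mean=31.0, school_standard=25.0, sd=12.0, n=40)
flag = "review this maneuver" if p < 0.05 else "within normal variation"
print(f"z = {z:.2f}, p = {p:.4f}: {flag}")
```

A larger class or a smaller spread of scores makes the same deviation from the standard more noteworthy, which is why both quantities enter the calculation.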

The  analysis  is  carried  a  step  further  in  Figure  3  in  which  individual  critical 
maneuvers  are  examined.  In  this  way,  causes  of  deviant  performance  can  be  examined  at 
a  more  detailed  level,  and  specific  remedies  can  be  developed. 

Data summaries of this sort provide management with a powerful tool to use in
evaluating and adjusting the training system. The advantage of such data is that they are
performance observations rather than performance evaluations. The functioning of
instructors and checkpilots, both by groups and individually, can also be examined in this
way, and corrective actions taken if the data indicate shortcomings of either the
instruction or evaluation systems. Details of this flight training quality control system are
presented in the work of Duffy and Colgan (6). Implications of this approach for other
training systems have been discussed in other HumRRO publications by Smith (7, 8).

[Figure 2. Evaluation of Class Performance (Errors vs. Maneuvers)]

[Figure 3. Class Performance on Critical Maneuvers (Error vs. Items)]

This application of the quality control concept might also be described under the
accountability concept that has come into prominence recently in education circles. The
instructor is held accountable, so to speak, for the quality of his output, that is, the
performance of his students. The detailed PPDR data summary for a number of students
of a given instructor provides a specific means of counseling that instructor and of
adjusting and standardizing his instruction as necessary.

This means of evaluating the flight instructor assumes a random assignment of
students to instructors so that student aptitude differences do not account for instructor
output differences. In more recent work, we have been seeking to control nonrandom,
interstudent difference effects through use of multiple regression predictions of student
performance in flight training. The discrepancies between the actual performances of all
students of a given instructor and those performances predicted from multiple regression
equations will provide a more precise measure of the instructor’s effect on his students
than discrepancies based on the random assignment model.
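
The regression-residual idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the equations actually used: the two predictors (an aptitude score and an academic grade), the student records, and the helper names `fit`, `predict`, and `instructor_effects` are all hypothetical. The sketch fits a least-squares equation to all students and then averages actual-minus-predicted grades for each instructor.

```python
from collections import defaultdict

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(predictors, outcomes):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcomes)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, p):
    """Predicted grade for one student's predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], p))

def instructor_effects(records, coefs):
    """Mean (actual - predicted) grade for each instructor's students."""
    residuals = defaultdict(list)
    for instructor, p, actual in records:
        residuals[instructor].append(actual - predict(coefs, p))
    return {name: sum(v) / len(v) for name, v in residuals.items()}

# Hypothetical records: (instructor, (aptitude score, academic grade), flight grade).
records = [
    ("Instructor A", (52, 81), 84.0),
    ("Instructor A", (47, 76), 80.0),
    ("Instructor A", (60, 88), 90.0),
    ("Instructor B", (55, 79), 78.0),
    ("Instructor B", (49, 85), 77.0),
    ("Instructor B", (63, 90), 85.0),
]
coefs = fit([p for _, p, _ in records], [y for _, _, y in records])
for name, effect in instructor_effects(records, coefs).items():
    print(f"{name}: mean residual {effect:+.2f}")
```

A positive mean residual indicates students who did better than their predictor scores would suggest. In practice the equation would be fitted on a much larger sample than any one instructor's students.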

In another series of studies, Caro (9) has investigated the effects of prior knowledge
on checkride evaluations. In one portion of advanced helicopter training, checkrides are
administered by instructor pilots from within the same instructional flight. The instructors
simply trade students at checkride time, giving the checkpilot (instructor) an
opportunity  to  learn  something  about  the  prior  performance  of  the  student  before  he 
administers  the  checkride.  At  the  least,  he  knows  who  the  student’s  instructor  is  and 
probably  has  ideas  on  what  kind  of  student  that  instructor  usually  turns  out. 

In  several  classes  we  brought  in  qualified  checkpilots  from  outside  the  instructing 
flight  to  administer  checkrides  to  a  portion  of  the  class.  The  remainder  of  the  class  was 
administered  checkrides  in  the  usual  manner  by  instructors  from  within  the  flight.  The 
correlations  between  instructor  evaluation  and  checkride  grade  for  these  two  conditions 
are  contrasted  in  Table  2.  Those  checkrides  involving  no  prior  information  (i.e.,  the 
“Special”  group)  showed  negligible  correlation,  whereas  those  done  from  within  the 
instructing  flight  (the  “Regular”  group)  showed  substantial  correlation.  From  these  data, 
Caro  concluded  that  prior  knowledge  of  the  student,  rather  than  similarity  of  evaluation 
standards,  may  have  accounted  for  the  higher  correlations  of  the  “Regular”  group. 

The  focus  of  research  so  far  described  is  the  standardization  process.  Aviation 
training  managers  recognize  the  need  for  standardization  and  devote  much  effort  to  its 
achievement.  In  spite  of  this  extensive  effort,  it  is  difficult,  and  perhaps  impossible,  to 
achieve  a  substantial  degree  of  standardization  using  highly  subjective  measures.  These  are 
matters  of  considerable  concern  in  military  flight  training  programs.  For  example,  Figure 
4  shows  mean  checkride  grade  and  ±1  standard  deviation  range  for  checkride  grades  given 
by  17  checkpilots.  The  interrater  differences  are  considerable.  Analysis  of  variance  shows 
these  differences  to  be  statistically  significant  (p  <  .001).  Such  variation  is  obviously  not 
desirable,  but  it  seems  to  be  an  inevitable  part  of  subjective  evaluation. 
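
The analysis of variance mentioned above can be sketched for a small case. The grades below are hypothetical and cover only three checkpilots rather than seventeen. The function computes the one-way F ratio and its degrees of freedom; obtaining the probability value would require an F table or a statistics library.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of lists of grades."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    # Variation of checkpilot means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individual grades around their own checkpilot's mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical checkride grades assigned by three checkpilots.
grades_by_checkpilot = [
    [88, 90, 85, 92],   # a lenient grader
    [80, 78, 83, 79],
    [72, 75, 70, 74],   # a severe grader
]
f, df_b, df_w = one_way_anova(grades_by_checkpilot)
print(f"F({df_b}, {df_w}) = {f:.2f}")
```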

Checkpilot  variation  among  seven  checkpilots  at  Fort  Wolters  is  shown  in  Figure  5. 
Four  of  these  were  then  given  training  in  the  use  of  the  PPDR.  The  effect  of  this  training 
on  their  relative  standardization  is  shown  in  Figure  6. 


Table 2

Correlations Between Instructor and Checkpilot Evaluations(a)

                                   Stage of Training
               Pre-Solo            Advanced         Instrument Cross-Country
Checkride     N   Correlation     N   Correlation      N   Correlation
Special       36     .28          40     .20           44     .18
Regular       24     .55(b)       20     .73(b)        18     .[?]4(b)

(a) Classes 63-1W and 63-3.
(b) Coefficients significantly greater than zero (p<.01).

[Figure 4. Mean and ±1 SD Range for Checkride Grades Assigned by 17 Checkpilots (horizontal axis: checkpilots A-Q and all checkpilots combined)]


Greer et al. (1) report that inter-checkpilot flight-check agreement increases as a
function of their degree of similarity in classroom evaluations of already marked PPDRs.
For example, checkpilots whose evaluations during classroom training correlated between
.95 and .99 showed flight interobserver, or test-retest, correlations of .70 for the
intermediate level check, whereas those unselected on the basis of their classroom
agreement showed flight correlation of only .42. Similar data for the advanced checkride
showed correlations of .61 and .52 for the classroom-similar and unselected groups, respectively.
Thus,  use  of  the  PPDR  or  similar  techniques  offers  an  indirect  means  of  increasing 
checkpilot  standardization. 

More  recently,  our  work  has  been  concerned  with  multiple  regression  approaches  to 
predicting  student  performance.  In  this  effort,  described  by  Boyles  and  Wahlberg  (10),  a 
computerized  data  bank  was  developed  for  predicting  a  variety  of  aviator  performances. 
Our  system,  which  owes  much  to  the  conceptions  so  ably  developed  by  Miss  Ambler  and 
her  colleagues  at  Pensacola  (e.g.,  see  Schoenberger,  Wherry,  and  Berkshire,  11),  presently 
contains  over  100  predictor  variables.  Included  are  variables  such  as  aptitude  and  ability 
measures,  demographic  data,  education,  academic  grades,  and  daily  and  checkride  flight 
grades. 

Data  are  in  the  computer  for  over  12,000  students  now,  with  several  thousand  more 
records  in  partial  stages  of  completion.  We  are  building  toward  not  only  the  prediction  of 
training  performance,  but  prediction  of  a  variety  of  operational  flight  performances  such 
as  combat,  flight  safety,  and  instructor  effectiveness.  We  view  the  data  bank  as  a 
longitudinal one. Our biggest need now is for predictor variables to account for aspects of
operational performance variance independent of training performance, that is, motivational
factors.

There  are  several  questions  relating  to  the  quality  or  kind  of  data  from  which 
multiple  predictions  are  made  that  are  of  interest  here.  In  a  study  of  the  use  of  a  captive 
helicopter as a training device, Caro, Isley, and Jolley (12) report data concerning the
predictability  of  subsequent  flight  performance  from  performance  on  the  device. 

They gathered 60 separate objective measures of performance on the captive helicopter
device during a preflight device training program either 3 1/4 or 7 1/4 hours in
duration.  These  measures  were  then  correlated  with  mean  daily  flight  grade,  time  to 
checkride,  and  checkride  grade  for  the  pre-solo,  intermediate,  and  advanced  stages  of 
primary  training. 

Maximum  correlations  with  the  three  pre-solo  stage  criterion  measures  were  shown 
by  certain  device  measures  reflecting  cumulative  time  to  achieve  basic  hovering  control  of 
the  device;  these  correlations  ranged  from  .52  to  .60.  At  the  intermediate  stage  (i.e.,  the 
first  50  hours),  these  same  measures,  plus  a  measure  of  lateral  right  tracking  error  and 
one  of  turn  rate  during  right  turns,  showed  the  maximum  correlation  with  flight 
performance;  correlations  ranged  from  .38  to  .46.  At  the  advanced  stage  (i.e.,  the 
100-hour level), several measures of precision hovering, which involved maintaining a
probe  attached  to  the  front  of  the  device  inside  either  a  10-inch  or  14-inch  hoop  without 
touching  it,  showed  the  highest  correlations;  values  ranged  from  .44  to  .52. 

Considering  the  time  lapse  between  the  device  training  and  the  advanced  checkride 
(over  four  months)  and  the  previous  comments  on  inter-checkride  correlation,  these  latter 
relationships  are  quite  high. 

The  precision  hover  task  involving  the  hoop  and  probe  is  particularly  interesting. 
Students  were  able  to  master  this  task  relatively  easily  on  the  device  and  to  perform  it 
quite  proficiently.  However,  expert  helicopter  pilots  had  great  difficulty  with  this 
particular task, even though they could hover the device well. Their difficulty stemmed
from their inability to use the visual cue sources for the hoop and probe so close to their
eyes  (about  six  feet).  Experienced  pilots  gather  their  hovering  information  from  more 
distant  sources.  It  is  interesting  that  this  artificial  task  put  in  for  training  purposes  only, 
one  that  lacked  face  validity  in  the  eyes  of  experienced  pilots,  was  one  of  the  more 
effective  for  predicting  performance  at  all  stages  of  training  and  was  the  most  effective 
for  predicting  advanced  performance. 

These data would suggest that early flight performance (for the student’s performance
on the device can be considered early flight performance) should be predictive of
subsequent flight performance. However, data from our multiple prediction study show
that the first five graded helicopter flights correlate only .32 with subsequent pass-fail, a
correlation similar to that reported by Schoenberger et al. (11) for presolo grades at
Pensacola.

In contrast, an earlier study of fixed wing training by Prophet and Jolley (5) showed
substantial correlation between early flight performance and subsequent success in the
program. A score based on the sum of errors made on seven selected flight maneuvers on
the first three days of flight showed product-moment correlation of .50 with checkride
performance at the 35-hour level. Inclusion of the first five days raised the correlation to
.63. Biserial correlations of these errors with pass-fail were .62 and .76 for the three- and
five-day periods, respectively.
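
For readers less familiar with the biserial coefficient, the following minimal sketch shows the textbook computation relating a continuous score to a pass-fail outcome. The error scores and outcomes are hypothetical, and the coding of the dichotomy (here, washout = True) is an assumption of this illustration; reversing the coding reverses the sign.

```python
import math
from statistics import NormalDist, mean, pstdev

def point_biserial(scores, group):
    """Point-biserial correlation between scores and a True/False grouping."""
    ones = [s for s, g in zip(scores, group) if g]
    zeros = [s for s, g in zip(scores, group) if not g]
    p = len(ones) / len(scores)
    q = 1 - p
    return (mean(ones) - mean(zeros)) / pstdev(scores) * math.sqrt(p * q)

def biserial(scores, group):
    """Biserial correlation, which assumes a normal variable underlies the dichotomy."""
    p = sum(1 for g in group if g) / len(group)
    q = 1 - p
    # Height of the standard normal curve at the point dividing p from q.
    h = NormalDist().pdf(NormalDist().inv_cdf(p))
    return point_biserial(scores, group) * math.sqrt(p * q) / h

# Hypothetical data: summed early-flight errors and whether the student washed out.
errors = [62, 58, 55, 47, 44, 40, 38, 35, 33, 30]
washed_out = [True, True, False, True, False, False, False, False, False, False]
print(f"point-biserial = {point_biserial(errors, washed_out):.2f}")
print(f"biserial       = {biserial(errors, washed_out):.2f}")
```

The biserial value is always larger in magnitude than the point-biserial computed from the same data, and with markedly non-normal scores it can exceed 1.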

These three sets of data present contrasting results on the predictability of subsequent
flight performance from measures of early or preflight performance. Those early or
preflight measures which showed substantial correlation with later flight performance
were based upon objective or relatively objective indices. The predictor measures of Caro
et al. (12) were based upon time to criterion, frequency counts, time measures, and
similar data. Those of Prophet and Jolley (5) were based upon a PPDR-like daily flight
record on which specific performances were noted for such indices as altitude and
airspeed. In contrast, the data reported in our multiple regression study and by the Navy
are based on subjective grades of daily flight performance (above average, average, below
average, and unsatisfactory). Thus, flight performance may be reliably predicted only if
the proper kinds of data are gathered. Not only do these observations have implications
for the kinds of data that should be provided by checkpilots and instructors, but they
suggest that the area of psychomotor selection testing is ground that needs replowing.

These points are illustrated in Table 3. These are the same fixed wing data (Prophet
and Jolley, 5) that produced correlations of .62 and .76 with pass-fail for three and five
days, respectively. It can be seen that most of the successful students were different from
the washouts from the very first day of training. Also, note that the washouts show
practically no improvement in performance over the entire five days. This suggests the
possibility that the initial selection screen let through a number of students who should
not have entered training. Perhaps a good psychomotor test might have picked them up.

These same data may also indicate that our training is grossly inappropriate for
substantial segments of our input population. The results of Caro et al. (12)
support this hypothesis, for they found a significant reduction in flight deficiency
attrition as a result of the preflight device training. Perhaps we should ask whether our
training systems possess sufficient flexibility to individualize instruction to meet the
needs of these students who have difficulty, seemingly, from the beginning of the
program. The captive helicopter provided the students of Caro et al. a relatively
[?] environment in which to learn certain skills and to develop confidence in
themselves. It was also unique in that the students received full, immediate, and often
emphatic feedback concerning the results of their control actions. While this is
peripheral to the main subject, there appears to be a crying need for definitive research
on what flight students do or don’t learn and why.

[...] need for time-lapse photographic techniques to gather flight data (Isley and Caro, 13). However,
data reduction is time consuming. Airborne videotape techniques seem to offer promise,
as do other airborne data recorders. We are following the work of the Air Force in this
area with considerable interest.


Table 3

Percent Error for Seven Selected Maneuvers
by Day of Training and 35-Hour Check Grade

                                   Percent Error by Training Day
Group (Based on 35-Hour Check)      1      2      3      4      5
Pre-35-hour washouts               63     62     60     57     57
35-hour washouts                   60     51     58     45     55
35-hour grade = 70-74              66     54     48     34     35
35-hour grade = 75-79              42     40     36     30     40
35-hour grade = 80-84              50     35     34     32     23
35-hour grade = 85-89              49     40     29     28     22
35-hour grade = 90-94              45     35     29     22     24

There seems little doubt that future major gains in the effectiveness and efficiency
of flight-proficiency measurement techniques will involve forms of automated measurement.
This is particularly relevant for the basic perceptual-motor control skill areas.
However, we must not lose sight of the fact that operational flying involves complex
decision making and cognitive factors overlaid on these control skills. This is the real
challenge for measurement research in aviation. I am not convinced that these complex,
mission-oriented  factors  can  be  sensed,  transduced,  and  then  recorded  adequately  by 
hardware. 

The  only  real  application  of  automated  performance-measurement  techniques  in 
helicopter  training  is  in  the  Army’s  Synthetic  Flight  Training  System  (SFTS)  currently 
undergoing  test  at  Fort  Rucker.  This  system  has  the  capability  of  automatically 
administering  training,  recording  and  evaluating  trainee  performance,  adapting  problem 
difficulty  level  to  manifest  performance,  and  sequencing  the  trainee  to  the  next  step  in 
the  training  program.  The  lack  of  hard  data  on  how  trainees  actually  perform  in 
maintaining  various  flight  parameters  within  tolerance  envelopes  and  the  manner  in  which 
these  envelopes  change  over  time  makes  automatic  measurement  difficult.  However, 
shortly  we  should  be  able  to  develop  a  much  more  complete  and  valid  picture  of  training 
performance  as  we  work  with  this  device.  We  also  intend  to  explore  quality  control 
applications  with  the  SFTS  equipment,  both  in  the  school  training  situation  and  in 
operational  helicopter  units. 
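
One simple form that automated scoring against a tolerance envelope might take is sketched below. This is purely an illustration and does not describe how the SFTS works: the altitude samples, the tolerance, the thresholds, and the five-level difficulty scale are all hypothetical.

```python
def fraction_in_tolerance(samples, target, tolerance):
    """Share of samples whose deviation from target is within +/- tolerance."""
    inside = sum(1 for s in samples if abs(s - target) <= tolerance)
    return inside / len(samples)

def next_difficulty(level, score, raise_at=0.9, lower_at=0.6, max_level=5):
    """Step difficulty up or down by one level based on the in-tolerance score."""
    if score >= raise_at:
        return min(level + 1, max_level)
    if score < lower_at:
        return max(level - 1, 1)
    return level

# Hypothetical altitude samples (feet) against an assigned altitude of 500 +/- 10.
altitude = [500, 504, 497, 512, 508, 495, 501, 499]
score = fraction_in_tolerance(altitude, target=500, tolerance=10)
print(f"in tolerance {score:.0%}; next difficulty level {next_difficulty(2, score)}")
```

Choosing the tolerance and the thresholds is exactly where the hard data on actual trainee performance, noted above as lacking, would be needed.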

In  summary,  the  Army  has  made  progress  in  its  flight-measurement  programs  over 
the  past  15  years.  The  PPDR  system  is  the  most  objective  and  detailed  flight  performance 
measurement system in operational use in a military flight training program. The applications
of quality control techniques in the Army represent substantial advancement, both
at  the  line  instructional  level  and  at  the  training  system  management  level.  However,  we 
have  a  long  way  to  go  before  the  students  no  longer  perceive  that  their  fate  is  largely  in 
the  sometimes  capricious  hands  of  the  “Santa  Clauses”  and  “Hardnoses”  who  check 
them. 